AITopics | action variable

action variable

8cbe9ce23f42628c98f80fa0fac8b19a-AuthorFeedback.pdf

Neural Information Processing Systems | Feb-12-2026, 21:28:58 GMT

algorithm, belief state, representation, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.34)

From Stochastic Planning to Marginal MAP

Hao (Jackson) Cui, Radu Marinescu, Roni Khardon

Neural Information Processing Systems | Feb-12-2026, 20:11:32 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, graph, node, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Medford (0.04)
North America > United States > Indiana > Monroe County > Bloomington (0.04)
North America > Canada (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)

From Stochastic Planning to Marginal MAP

Hao (Jackson) Cui, Radu Marinescu, Roni Khardon

Neural Information Processing Systems | Nov-20-2025, 16:26:30 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Medford (0.04)
North America > United States > Indiana > Monroe County > Bloomington (0.04)
North America > Canada (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)

Sampling Networks and Aggregate Simulation for Online POMDP Planning

Hao (Jackson) Cui, Roni Khardon

Neural Information Processing Systems | Oct-3-2025, 04:46:33 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, belief state, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report (0.46)

8cbe9ce23f42628c98f80fa0fac8b19a-AuthorFeedback.pdf

Neural Information Processing Systems | Oct-3-2025, 04:46:18 GMT

algorithm, belief state, representation, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.34)

Reinforcement Learning for Control with Multiple Frequencies

Jongmin Lee, Byung-Jun Lee

Neural Information Processing Systems | Oct-2-2025, 10:53:35 GMT

Finally, we define the c-persistent policy π as follows: Definition 1.

action persistence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: North America (0.68)

Industry: Transportation (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Learning Joint Interventional Effects from Single-Variable Interventions in Additive Models

Kekić, Armin; Mejia, Sergio Hernan Garrido; Schölkopf, Bernhard

arXiv.org Machine Learning | Jun-6-2025

Estimating causal effects of joint interventions on multiple variables is crucial in many domains, but obtaining data from such simultaneous interventions can be challenging. Our study explores how to learn joint interventional effects using only observational data and single-variable interventions. We present an identifiability result for this problem, showing that for a class of nonlinear additive outcome mechanisms, joint effects can be inferred without access to joint interventional data. We propose a practical estimator that decomposes the causal effect into confounded and unconfounded contributions for each intervention variable. Experiments on synthetic data demonstrate that our method achieves performance comparable to models trained directly on joint interventional data, outperforming a purely observational estimator.

artificial intelligence, learning joint interventional effect, machine learning, (11 more...)

arXiv.org Machine Learning

2506.04945

Country:

Europe > Germany (0.04)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Active Causal Structure Learning with Latent Variables: Towards Learning to Detour in Autonomous Robots

Riscos, Pablo de los; Corbacho, Fernando

arXiv.org Artificial Intelligence | Oct-28-2024

Artificial General Intelligence (AGI) agents and robots must be able to cope with ever-changing environments and tasks. They must be able to actively construct new internal causal models of their interactions with the environment when new structural changes take place in the environment. Thus, we claim that active causal structure learning with latent variables (ACSLWL) is a necessary component to build AGI agents and robots. This paper describes how a complex planning and expectation-based detour behavior can be learned by ACSLWL when, unexpectedly and for the first time, the simulated robot encounters a sort of "transparent" barrier in its pathway towards its target. ACSLWL consists of acting in the environment, discovering new causal relations, constructing new causal models, exploiting the causal models to maximize expected utility, detecting possible latent variables when unexpected observations occur, and constructing new structures (internal causal models and optimal estimates of the associated parameters) to cope efficiently with newly encountered situations. That is, the agent must be able to construct new internal causal models that transform a previously unexpected and inefficient (sub-optimal) situation into a predictable situation with an optimal operating plan.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2410.20894

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Spain > Galicia > Madrid (0.04)
(7 more...)

Genre: Research Report (0.81)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Reinforcement Learning in High-frequency Market Making

Zheng, Yuheng; Ding, Zihan

arXiv.org Machine Learning | Aug-12-2024

This paper establishes a new and comprehensive theoretical analysis of reinforcement learning (RL) applied to high-frequency market making. We bridge modern RL theory and the continuous-time statistical models of high-frequency financial economics. Unlike most of the existing literature, which is methodological and develops various RL methods for the market making problem, our work is a pilot study that provides the theoretical analysis. We target the effects of sampling frequency and find an interesting tradeoff between the error and the complexity of the RL algorithm when tuning the time increment $\Delta$: as $\Delta$ becomes smaller, the error becomes smaller but the complexity becomes larger. We also study the two-player case under the general-sum game framework and establish the convergence of the Nash equilibrium to the continuous-time game equilibrium as $\Delta \rightarrow 0$. The Nash Q-learning algorithm, an online multi-agent RL method, is applied to solve for the equilibrium. Our theories are not only useful for practitioners choosing the sampling frequency, but are also very general and applicable to other high-frequency financial decision making problems, e.g., optimal execution, as long as the time discretization of a continuous-time Markov decision process is adopted. Monte Carlo simulation evidence supports all of our theories.

algorithm, equilibrium, nash equilibrium, (16 more...)

arXiv.org Machine Learning

2407.21025

Genre: Research Report > New Finding (0.46)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)

Efficient Monte Carlo Tree Search via On-the-Fly State-Conditioned Action Abstraction

Kwak, Yunhyeok; Hwang, Inwoo; Kim, Dooyoung; Lee, Sanghack; Zhang, Byoung-Tak

arXiv.org Artificial Intelligence | Jun-2-2024

Monte Carlo Tree Search (MCTS) has showcased its efficacy across a broad spectrum of decision-making problems. However, its performance often degrades under a vast combinatorial action space, especially where an action is composed of multiple sub-actions. In this work, we propose an action abstraction based on the compositional structure between a state and sub-actions for improving the efficiency of MCTS under a factored action space. Our method learns a latent dynamics model with an auxiliary network that captures the sub-actions relevant to the transition from the current state, which we call state-conditioned action abstraction. Notably, it infers such compositional relationships from high-dimensional observations without a known environment model. During the tree traversal, our method constructs the state-conditioned action abstraction for each node on-the-fly, reducing the search space by discarding the exploration of redundant sub-actions. Experimental results demonstrate the superior sample efficiency of our method compared to vanilla MuZero, which suffers from the expansive action space.

abstraction, action abstraction, action space, (13 more...)

arXiv.org Artificial Intelligence

2406.00614

Country: